{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Updatable Linked Model - Tiny Drawing Classifier\n",
    "\n",
    "This notebook creates a model which can be used to train a simple drawing / sketch classifier based on user examples. The model is a 'linked' pipeline composed of a 'linked' drawing embedding model and a nearest neighbor classifier. \n",
    "\n",
    "The model is updatable and starts off 'empty' in that the nearest neighbor classifier has no examples or labels. Before updating with training examples it predicts 'unknown' for all input.\n",
    "\n",
    "The input to the model is a 28 x 28 grayscale drawing. The background is expected to be black (0) while the strokes of the drawing should be rendered as white (255). For example:\n",
    "\n",
    "| Drawing of a Star | Drawing of a Heart | Drawing of 5 |\n",
    "| ----------- | ----------- | ----------- |\n",
    "| ![Star Example](images/star28x28.png) | ![Heart Example](images/heart28x28.png) | ![Five Example](images/five28x28.png) |\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Embedding\n",
    "\n",
    "Let's start by getting the first part of the model.  It is the drawing embedding model which will be used as a feature extractor"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "input {\n",
      "  name: \"drawing\"\n",
      "  shortDescription: \"Input sketch image with black background and white strokes\"\n",
      "  type {\n",
      "    imageType {\n",
      "      width: 28\n",
      "      height: 28\n",
      "      colorSpace: GRAYSCALE\n",
      "    }\n",
      "  }\n",
      "}\n",
      "output {\n",
      "  name: \"embedding\"\n",
      "  shortDescription: \"Vector embedding of sketch in 128 dimensional space\"\n",
      "  type {\n",
      "    multiArrayType {\n",
      "      shape: 128\n",
      "      dataType: FLOAT32\n",
      "    }\n",
      "  }\n",
      "}\n",
      "metadata {\n",
      "  shortDescription: \"Embeds a 28 x 28 grayscale image of a sketch into 128 dimensional space. The model was created by removing the last layer of a simple convolution based neural network classifier trained on the Quick, Draw! dataset (https://github.com/googlecreativelab/quickdraw-dataset).\"\n",
      "  author: \"Core ML Tools Example\"\n",
      "  license: \"MIT\"\n",
      "}\n",
      "\n"
     ]
    }
   ],
   "source": [
    "import coremltools\n",
    "from coremltools.models import MLModel\n",
    "\n",
    "embedding_path = './models/TinyDrawingEmbedding.mlmodel'\n",
    "embedding_model = MLModel(embedding_path)\n",
    "\n",
    "embedding_spec = embedding_model.get_spec()\n",
    "print embedding_spec.description"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can see from the description above that the embedding model takes in a 28x28 grayscale image about outputs a 128 dimensional float vector."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Linked Model\n",
    "Let's create a linked model that points to the neural network feature extractor we just created. It requires the model file name and search path."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "import coremltools\n",
    "linked_model_spec = coremltools.proto.Model_pb2.Model()\n",
    "linked_model_spec.specificationVersion = coremltools._MINIMUM_LINKED_MODELS_SPEC_VERSION\n",
    "linked_model_spec.description.metadata.shortDescription = 'Linked model which points to the TinyDrawingEmbedding model.'\n",
    "\n",
    "# Input and output are the same as the model it is pointing to.\n",
    "linked_model_spec.description.input.extend(embedding_spec.description.input[:])\n",
    "linked_model_spec.description.output.extend(embedding_spec.description.output[:])\n",
    "\n",
    "fileName = coremltools.proto.Parameters_pb2.StringParameter()\n",
    "fileName.defaultValue = 'TinyDrawingEmbedding.mlmodelc'\n",
    "linked_model_spec.linkedModel.linkedModelFile.linkedModelFileName.CopyFrom(fileName)\n",
    "\n",
    "# Search path to find the linked model file\n",
    "# Multiple paths can be searched using the unix-style path separator \":\"\n",
    "# Each path can be relative (to this model) or absolute\n",
    "#\n",
    "# An empty string is the same as teh relative search path \".\"\n",
    "# which searches in the same location as this model file\n",
    "#\n",
    "# There are some special paths which start with $\n",
    "# $BUNDLE_MAIN - Indicates to look in the main bundle\n",
    "# $BUNDLE_IDENTIFIER(identifier) - Looks in Bunde with given identifer\n",
    "searchPath = coremltools.proto.Parameters_pb2.StringParameter()\n",
    "searchPath.defaultValue = '.:$BUNDLE_MAIN'\n",
    "linked_model_spec.linkedModel.linkedModelFile.linkedModelSearchPath.CopyFrom(searchPath)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Nearest Neighbor Classifier\n",
    "\n",
    "Now that the feature extractor is in place, let's create the second model of our pipeline model.\n",
    "It is a nearest neighbor classifier operating on the embedding."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [],
   "source": [
    "from coremltools.models.nearest_neighbors import KNearestNeighborsClassifierBuilder\n",
    "import coremltools.models.datatypes as datatypes\n",
    "\n",
    "knn_builder = KNearestNeighborsClassifierBuilder(input_name='embedding',\n",
    "                                                 output_name='label',\n",
    "                                                 number_of_dimensions=128,\n",
    "                                                 default_class_label='unknown',\n",
    "                                                 number_of_neighbors=3,\n",
    "                                                 weighting_scheme='inverse_distance',\n",
    "                                                 index_type='linear')\n",
    "\n",
    "knn_builder.author = 'Core ML Tools Example'\n",
    "knn_builder.license = 'MIT'\n",
    "knn_builder.description = 'Classifies 128 dimension vector based on 3 nearest neighbors'\n",
    "\n",
    "knn_spec = knn_builder.spec\n",
    "knn_spec.specificationVersion = coremltools._MINIMUM_NEAREST_NEIGHBORS_SPEC_VERSION\n",
    "knn_spec.description.input[0].shortDescription = 'Input vector to classify'\n",
    "knn_spec.description.output[0].shortDescription = 'Predicted label. Defaults to \\'unknown\\''\n",
    "knn_spec.description.output[1].shortDescription = 'Probabilities / score for each possible label.'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Updatable Pipeline\n",
    "\n",
    "Last step is to create the pipeline model and insert the linked model and the nearest neighbor classifier. The model will be set to be updatable."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "pipeline_spec = coremltools.proto.Model_pb2.Model()\n",
    "pipeline_spec.specificationVersion = coremltools._MINIMUM_UPDATABLE_SPEC_VERSION\n",
    "pipeline_spec.isUpdatable = True\n",
    "\n",
    "# Inputs are the inputs from the linked model\n",
    "pipeline_spec.description.input.extend(linked_model_spec.description.input[:])\n",
    "\n",
    "# Outputs are the outputs from the classification model\n",
    "pipeline_spec.description.output.extend(knn_spec.description.output[:])\n",
    "pipeline_spec.description.predictedFeatureName = knn_spec.description.predictedFeatureName\n",
    "pipeline_spec.description.predictedProbabilitiesName = knn_spec.description.predictedProbabilitiesName\n",
    "\n",
    "# Training inputs\n",
    "pipeline_spec.description.trainingInput.extend([linked_model_spec.description.input[0]])\n",
    "pipeline_spec.description.trainingInput[0].shortDescription = 'Example sketch'\n",
    "pipeline_spec.description.trainingInput.extend([knn_spec.description.output[0]])\n",
    "pipeline_spec.description.trainingInput[1].shortDescription = 'Associated true label of example sketch'\n",
    "\n",
    "# Provide metadata\n",
    "pipeline_spec.description.metadata.author = 'Core ML Tools'\n",
    "pipeline_spec.description.metadata.license = 'MIT'\n",
    "pipeline_spec.description.metadata.shortDescription = ('An updatable model which can be used to train a tiny 28 x 28 drawing classifier based on user examples.'\n",
    "                                                       ' It uses a drawing embedding trained on the Quick, Draw! dataset (https://github.com/googlecreativelab/quickdraw-dataset)')\n",
    "\n",
    "# Construct pipeline by adding the embedding and then the nearest neighbor classifier\n",
    "pipeline_spec.pipelineClassifier.pipeline.models.add().CopyFrom(linked_model_spec)\n",
    "pipeline_spec.pipelineClassifier.pipeline.models.add().CopyFrom(knn_spec)\n",
    "\n",
    "# Save the updated spec.\n",
    "# Note that to use this \"linked\" pipeline, both LinkedUpdatableTinyDrawingClassifier.mlmodel and TinyDrawingEmbedding.mlmodel must be imported into the project.\n",
    "from coremltools.models import MLModel\n",
    "mlmodel = MLModel(pipeline_spec)\n",
    "\n",
    "output_path = './LinkedUpdatableTinyDrawingClassifier.mlmodel'\n",
    "from coremltools.models.utils import save_spec\n",
    "mlmodel.save(output_path)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
